<!DOCTYPE html>
<html xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd" class="_Skins_HTML5___Top_Navigation" lang="en-us" xml:lang="en-us" data-mc-search-type="Stem" data-mc-help-system-file-name="Default.xml" data-mc-path-to-help-system="../../" data-mc-has-content-body="True" data-mc-target-type="WebHelp2" data-mc-runtime-file-type="Topic;Default" data-mc-preload-images="false" data-mc-in-preview-mode="false" data-mc-toc-path="Evaluating Experiments with Inference Testing">
    <head>
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <meta charset="utf-8" />
        <meta http-equiv="X-UA-Compatible" content="IE=edge" />
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
        <link href="../../Skins/Default/Stylesheets/Slideshow.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Default/Stylesheets/TextEffects.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Default/Stylesheets/Topic.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Default/Stylesheets/Components/Styles.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Default/Stylesheets/Components/Tablet.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Default/Stylesheets/Components/Mobile.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Fluid/Stylesheets/foundation.6.2.3.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Fluid/Stylesheets/Styles.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Fluid/Stylesheets/Tablet.css" rel="stylesheet" type="text/css" data-mc-generated="True" />
        <link href="../../Skins/Fluid/Stylesheets/Mobile.css" rel="stylesheet" type="text/css" data-mc-generated="True" /><title>Streaming Inference Example</title>
        <link href="../Resources/Stylesheets/Styles.css" rel="stylesheet" type="text/css" />
        <script src="../../Resources/Scripts/custom.modernizr.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/jquery.min.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/require.min.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/require.config.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/foundation.6.2.3_custom.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/plugins.min.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapGlobal.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapDom.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapUtilities.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapXhr.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapTextEffects.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapSlideshow.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapFeedback.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapDefault.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapHelpSystem.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapToc.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapToc.Breadcrumbs.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapToc.MiniToc.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapToc.SideMenu.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapIndex.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapGlossary.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapParser.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapSearch.js" type="text/javascript">
        </script>
        <script src="../../Resources/Scripts/MadCapTopic.js" type="text/javascript">
        </script>
    </head>
    <body>
        <div class="foundation-wrap off-canvas-wrapper">
            <div class="off-canvas-wrapper-inner" data-off-canvas-wrapper="">
                <aside class="off-canvas position-left" id="offCanvas" data-off-canvas="" data-position="left" data-mc-ignore="true">
                    <ul class="off-canvas-drilldown vertical menu off-canvas-list" data-drilldown="" data-mc-back-link="Back" data-mc-css-tree-node-expanded="is-drilldown-submenu-parent" data-mc-css-tree-node-collapsed="is-drilldown-submenu-parent" data-mc-css-sub-menu="vertical menu slide-in-left is-drilldown-submenu" data-mc-include-indicator="False" data-mc-include-icon="False" data-mc-include-parent-link="True" data-mc-include-back="True" data-mc-defer-expand-event="True" data-mc-expand-event="click.zf.drilldown" data-mc-toc="True">
                    </ul>
                </aside>
                <div class="off-canvas-content inner-wrap" data-off-canvas-content="">
                    <div data-sticky-container="" class="title-bar-container">
                        <nav class="title-bar tab-bar sticky" data-sticky="" data-options="marginTop:0" style="width:100%" data-sticky-on="only screen and (max-width: 1000px)" data-mc-ignore="true">
                            <div class="middle title-bar-section outer-row clearfix">
                                <div class="menu-icon-container relative clearfix">
                                    <button class="menu-icon" data-toggle="offCanvas"><span></span>
                                    </button>
                                </div>
                            </div>
                            <div class="title-bar-layout outer-row">
                                <div class="logo-wrapper"><a class="logo" href="index.htm" alt="Logo"></a>
                                </div>
                                <div class="navigation-wrapper nocontent">
                                    <ul class="navigation clearfix" data-mc-css-tree-node-has-children="has-children" data-mc-css-sub-menu="sub-menu" data-mc-expand-event="mouseenter" data-mc-top-nav-menu="True" data-mc-max-depth="3" data-mc-include-icon="False" data-mc-include-indicator="False" data-mc-include-children="True" data-mc-include-siblings="True" data-mc-include-parent="True" data-mc-toc="True">
                                        <li class="placeholder" style="visibility:hidden"><a>placeholder</a>
                                        </li>
                                    </ul>
                                </div>
                                <div class="nav-search-wrapper">
                                    <div class="nav-search row">
                                        <form class="search" action="#">
                                            <div class="search-bar search-bar-container needs-pie">
                                                <input class="search-field needs-pie" type="search" placeholder="Search" />
                                                <div class="search-filter-wrapper">
                                                    <div class="search-filter">
                                                        <div class="search-filter-content">
                                                            <ul>
                                                                <li>All Files</li>
                                                            </ul>
                                                        </div>
                                                    </div>
                                                </div>
                                                <div class="search-submit-wrapper" dir="ltr">
                                                    <div class="search-submit" title="Search">
                                                    </div>
                                                </div>
                                            </div>
                                        </form>
                                    </div>
                                </div>
                            </div>
                        </nav>
                    </div>
                    <section class="main-section">
                        <div class="row outer-row sidenav-layout">
                            <div class="sidenav-wrapper">
                                <div class="sidenav-container">
                                    <ul class="off-canvas-accordion vertical menu sidenav" data-accordion-menu="" data-mc-css-tree-node-expanded="is-accordion-submenu-parent" data-mc-css-tree-node-collapsed="is-accordion-submenu-parent" data-mc-css-sub-menu="vertical menu accordion-menu is-accordion-submenu nested" data-mc-include-indicator="False" data-mc-include-icon="False" data-mc-include-parent-link="False" data-mc-include-back="False" data-mc-defer-expand-event="True" data-mc-expand-event="click.zf.accordionMenu" data-mc-toc="True" data-mc-side-nav-menu="True">
                                    </ul>
                                </div>
                            </div>
                            <div class="body-container" data-mc-content-body="True">
                                <h1>Streaming Inference Example</h1>
                                <h2>Example Flow</h2>
                                <p>Following is the basic task flow for this example.</p>
                                <ol>
                                    <li value="1">The user has saved a trained Tensorflow Serving-compatible model.</li>
                                    <li value="2">The user will send data for inference in JSON format, or in binary format using the gRPC API.</li>
                                    <li value="3">The user runs the <span style="font-family: 'Courier New';">nctl predict launch</span> command.</li>
                                    <li value="4">The user sends inference data using the <span style="font-family: 'Courier New';">nctl predict stream</span> command, the Tensorflow Serving REST API, or the Tensorflow Serving gRPC API.</li>
                                </ol>
                                <h2>Tensorflow Serving Basic Example</h2>
                                <h3>Launching a Streaming Inference Instance</h3>
                                <p>Basic models for testing Tensorflow Serving are included in the <a href="https://github.com/tensorflow/serving">https://github.com/tensorflow/serving</a> repository. This example uses the <span style="font-family: 'Courier New';">saved_model_half_plus_two_cpu</span> model to demonstrate streaming prediction capabilities.</p>
                                <p>To use that model for streaming inference, perform the following steps:</p>
                                <ol>
                                    <li value="1">Clone the <a href="https://github.com/tensorflow/serving">https://github.com/tensorflow/serving</a> repository: <br /><span style="font-family: 'Courier New';">git clone https://github.com/tensorflow/serving</span></li>
                                    <li value="2">Perform step 3 or step 4 below, based on preference.</li>
                                    <li value="3">Run the following command: <br /><span style="font-family: 'Courier New';">nctl predict launch --local_model_location &lt;directory where you have cloned Tensorflow Serving&gt;/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu</span></li>
                                    <li value="4">Alternatively to step 3, you may want to save a trained model on the input share so that it can be reused by other experiments and prediction instances. To do this, run these commands:<ol style="list-style-type: lower-alpha;"><li value="1">Use the mount command to mount the NAUTA input folder to your local machine: <br /> <span style="font-family: 'Courier New';">nctl mount</span><br />Execute the resulting command printed by <span style="font-family: 'Courier New';">nctl mount</span>; afterwards, you will be able to access the input share on your local file system.</li><li value="2">Copy the <span style="font-family: 'Courier New';">saved_model_half_plus_two_cpu</span> model to the input share: <br /><b>Execute</b>: <span style="font-family: 'Courier New';">cp -r &lt;directory where you have cloned Tensorflow Serving&gt;/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu &lt;directory where you have mounted /mnt/input share&gt;</span></li><li value="3">Run the following command. <br /><b>Execute</b>: <span style="font-family: 'Courier New';">nctl predict launch --model-location /mnt/input/saved_model_half_plus_two_cpu</span></li></ol></li>
                                </ol>
                                <p><b>Note</b>: <span style="font-family: 'Courier New';">--model-name</span> can optionally be passed to the <span style="font-family: 'Courier New';">nctl predict launch</span> command. If it is not provided, the model name is assumed to be the last directory in the model location, for example:<br /><br /> <span style="font-family: 'Courier New';">/mnt/input/home/trained_mnist_model</span> -&gt; <span style="font-family: 'Courier New';">trained_mnist_model</span></p>
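<p>The default model-name derivation described in the note above can be sketched in a few lines of Python (a minimal illustration; the actual <span style="font-family: 'Courier New';">nctl</span> implementation may differ):</p>

```python
import posixpath


def default_model_name(model_location: str) -> str:
    """Derive the default model name: the last directory in the model location."""
    return posixpath.basename(model_location.rstrip("/"))


print(default_model_name("/mnt/input/home/trained_mnist_model"))  # trained_mnist_model
```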
                                <h3>Using a Streaming Inference Instance</h3>
                                <p>After running the <span style="font-family: 'Courier New';">predict launch</span> command, <span style="font-family: 'Courier New';">nctl</span> will create a streaming inference instance that can be used in multiple ways, as described below.</p>
                                <h3>Streaming Inference with <span style="font-family: 'Courier New';">nctl predict stream</span> Command</h3>
                                <p>The <span style="font-family: 'Courier New';">nctl predict stream</span> command performs inference on input data stored in JSON format. This method is convenient for manually testing a trained model and provides a simple way to get inference results. For <span style="font-family: 'Courier New';">saved_model_half_plus_two_cpu</span>, write the following input data and save it in an <span style="font-family: 'Courier New';">inference-data.json</span> file:</p>
                                <p><span style="font-family: 'Courier New';">{"instances": [1.0, 2.0, 5.0]}</span>
                                </p>
                                <p>The <span style="font-family: 'Courier New';">saved_model_half_plus_two_cpu</span> model is simple: for a given input value <span style="font-family: 'Courier New';">x</span>, it predicts the result of <span style="font-family: 'Courier New';">x/2 + 2</span>. We passed the inputs <span style="font-family: 'Courier New';">1.0</span>, <span style="font-family: 'Courier New';">2.0</span>, and <span style="font-family: 'Courier New';">5.0</span> to the model, so the expected predictions are <span style="font-family: 'Courier New';">2.5</span>, <span style="font-family: 'Courier New';">3.0</span>, and <span style="font-family: 'Courier New';">4.5</span>. To use this data for prediction, check the name of the running prediction instance serving the <span style="font-family: 'Courier New';">saved_model_half_plus_two_cpu</span> model (the name is displayed after the <span style="font-family: 'Courier New';">nctl predict launch</span> command executes; you can also list running prediction instances with the <span style="font-family: 'Courier New';">nctl predict list</span> command). Then run the following command:</p>
                                <p><span style="font-family: 'Courier New';">$ nctl predict stream --name &lt;prediction instance name&gt; --data inference-data.json</span>
                                </p>
                                <p>The following results will be produced:</p>
                                <p><span style="font-family: 'Courier New';">{ "predictions": [2.5, 3.0, 4.5] }</span>
                                </p>
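<p>Because the model simply computes <span style="font-family: 'Courier New';">x/2 + 2</span>, the expected predictions can be reproduced locally with a few lines of Python:</p>

```python
def half_plus_two(x: float) -> float:
    """Reference computation performed by the saved_model_half_plus_two_cpu model."""
    return x / 2 + 2


instances = [1.0, 2.0, 5.0]
predictions = [half_plus_two(x) for x in instances]
print(predictions)  # [2.5, 3.0, 4.5]
```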
                                <p>Tensorflow Serving exposes three different method verbs for getting inference results. Selecting the proper method verb depends on the model used and the expected results. Refer to <a href="https://www.tensorflow.org/serving/api_rest">https://www.tensorflow.org/serving/api_rest</a> for more detailed information. These method verbs are:</p>
                                <ul>
                                    <li>classify</li>
                                    <li>regress</li>
                                    <li>predict</li>
                                </ul>
                                <p>By default, <span style="font-family: 'Courier New';">nctl predict stream</span> will use the <span style="font-family: 'Courier New';">PREDICT</span> method verb. You can change it by passing the <span style="font-family: 'Courier New';">--method-verb</span> parameter to the <span style="font-family: 'Courier New';">nctl predict stream</span> command, for example:</p>
                                <p><span style="font-family: 'Courier New';">nctl predict stream --name &lt;prediction instance name&gt; --data inference-data.json --method-verb CLASSIFY</span>
                                </p>
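<p>In the Tensorflow Serving REST API, each method verb maps to a different URL suffix. The sketch below shows how such an endpoint URL is composed; the host and port are illustrative:</p>

```python
def serving_url(host: str, port: int, model_name: str, verb: str) -> str:
    """Compose a Tensorflow Serving REST endpoint for a model and method verb."""
    if verb not in ("classify", "regress", "predict"):
        raise ValueError(f"unknown method verb: {verb}")
    return f"http://{host}:{port}/v1/models/{model_name}:{verb}"


for verb in ("classify", "regress", "predict"):
    print(serving_url("localhost", 8501, "saved_model_half_plus_two_cpu", verb))
```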
                                <h3>Streaming Inference with Tensorflow Serving REST API</h3>
                                <p>Another way to interact with a running prediction instance is to use the Tensorflow Serving REST API. This approach can be useful for more sophisticated use cases, such as integrating data-collecting scripts and applications with prediction instances.</p>
                                <p>The URL and authorization header for accessing the Tensorflow Serving REST API are shown after the prediction instance is submitted, as in the example below.</p>
                                <p style="text-align: center;">
                                    <img src="../images/predict_launch_681x588.png" style="border: 1px solid;width: 681px;height: 588px;" />
                                </p>
                                <h3>Accessing the REST API with curl</h3>
                                <p>Here is an example of accessing the REST API using curl, with the following command:</p>
                                <p><span style="font-family: 'Courier New';">curl -k -X POST -d @inference-data.json -H 'Authorization: Bearer &lt;authorization token data&gt;' localhost:8501/v1/models/&lt;model_name, e.g. saved_model_half_plus_two_cpu&gt;:predict</span>
                                </p>
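<p>The same call can be sketched with Python's standard library. The snippet below only builds the request object to show the URL, payload, and headers involved; actually sending it (the commented-out <span style="font-family: 'Courier New';">urlopen</span> call) requires a running prediction instance and a real authorization token:</p>

```python
import json
import urllib.request


def build_predict_request(token: str, payload: dict, url: str) -> urllib.request.Request:
    """Build a POST request equivalent to the curl example above."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_predict_request(
    "my-token",  # hypothetical; use the token shown by nctl predict launch
    {"instances": [1.0, 2.0, 5.0]},
    "http://localhost:8501/v1/models/saved_model_half_plus_two_cpu:predict",
)
print(req.full_url)
# With a running instance: print(urllib.request.urlopen(req).read())
```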
                                <h3>Using Port Forwarding</h3>
                                <p>Alternatively, the Kubernetes port forwarding mechanism may be used. You can create a port forwarding tunnel to the prediction instance with the following command:</p>
                                <p><span style="font-family: 'Courier New';">kubectl port-forward service/&lt;prediction instance name&gt; :8501</span></p>
                                <p>Or, if you want to start the port forwarding tunnel in the background, use this command:</p>
                                <p><span style="font-family: 'Courier New';">kubectl port-forward service/&lt;prediction instance name&gt; &lt;some local port number&gt;:8501 &amp;</span>
                                </p>
                                <p><b>Note</b>:&#160;Remember the local port number of the tunnel; if you do not specify it explicitly, it will be printed by <span style="font-family: 'Courier New';">kubectl port-forward</span>.</p>
                                <p>Now you can access the REST API at the following URL (example only).</p>
                                <p><span style="font-family: 'Courier New';">localhost:&lt;local tunnel port number&gt;/v1/models/&lt;model_name, e.g. saved_model_half_plus_two_cpu&gt;:&lt;method verb&gt;</span>
                                </p>
                                <h3>Example of Accessing REST API Using curl</h3>
                                <p><span style="font-family: 'Courier New';">curl -X POST -d @inference-data.json localhost:&lt;local tunnel port number&gt;/v1/models/&lt;model_name, e.g. saved_model_half_plus_two_cpu&gt;:predict</span>
                                </p>
                                <h3>Streaming Inference with Tensorflow Serving gRPC API</h3>
                                <p>Another way to interact with a running prediction instance is to use Tensorflow Serving gRPC. This approach can be useful for more sophisticated use cases, such as integrating data-collecting scripts and applications with prediction instances, and it should provide better performance than the REST API.</p>
                                <p>To access the Tensorflow Serving gRPC API of a running prediction instance, the Kubernetes port forwarding mechanism must be used. Create a port forwarding tunnel to a prediction instance with the following command:</p>
                                <p><span style="font-family: 'Courier New';">kubectl port-forward service/&lt;prediction instance name&gt; :8500</span>
                                </p>
                                <p>Or, if you want to start the port forwarding tunnel in the background:</p>
                                <p><span style="font-family: 'Courier New';">kubectl port-forward service/&lt;prediction instance name&gt; &lt;some local port number&gt;:8500 &amp;</span>
                                </p>
                                <p><b>Note</b>:&#160;Remember the local port number of the tunnel; if you do not specify it explicitly, it will be printed by <span style="font-family: 'Courier New';">kubectl port-forward</span>.</p>
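<p>When no local port is specified, <span style="font-family: 'Courier New';">kubectl port-forward</span> typically prints the port it chose in a status line such as <span style="font-family: 'Courier New';">Forwarding from 127.0.0.1:45123 -&gt; 8500</span>. A small sketch of extracting that port from the output (the exact line format may vary between kubectl versions, so treat this as an assumption):</p>

```python
import re


def local_port(forward_line: str) -> int:
    """Extract the local port from a kubectl port-forward status line."""
    match = re.search(r"Forwarding from .*:(\d+) -> \d+", forward_line)
    if match is None:
        raise ValueError(f"unrecognized port-forward output: {forward_line!r}")
    return int(match.group(1))


print(local_port("Forwarding from 127.0.0.1:45123 -> 8500"))  # 45123
```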
                                <p>You can access the gRPC API using a dedicated gRPC client (such as <a href="https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_client.py">https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_client.py</a>).</p>
                                <p>Alternatively, use a gRPC CLI client of your choice (such as grpcc or polyglot) and connect to:</p>
                                <p><span style="font-family: 'Courier New';">localhost:&lt;local tunnel port number&gt;</span>
                                </p>
                                <h2>References</h2>
                                <ul>
                                    <li><a href="https://www.tensorflow.org/serving/serving_basic">https://www.tensorflow.org/serving/serving_basic</a></li>
                                    <li><a href="https://www.tensorflow.org/serving/docker">https://www.tensorflow.org/serving/docker</a></li>
                                    <li><a href="https://www.tensorflow.org/serving/api_rest">https://www.tensorflow.org/serving/api_rest</a></li>
                                </ul>
                            </div>
                        </div>
                    </section><a data-close="true"></a>
                </div>
            </div>
            <script>/* <![CDATA[ */$(document).foundation();/* ]]> */</script>
        </div>
    </body>
</html>